Abstract. We propose a wide class of preferential attachment models of random graphs that generalizes previous approaches. Graphs described by these models obey a power-law degree distribution whose exponent can be controlled within the models. Moreover, the clustering coefficient of these graphs can also be controlled. We propose a concrete flexible model from our class and provide an efficient algorithm for generating graphs in this model. All our theoretical results are demonstrated in practice on examples of graphs obtained using this algorithm. Moreover, observations of generated graphs lead to future questions and hypotheses not yet justified by theory.

1 Introduction

Numerous random graph models have been proposed to reflect and predict important quantitative and topological aspects of growing real-world networks, from the Internet and society [1,4,7] to biological networks [22]. Though largely successful in capturing their key qualitative properties, such models may lack some important characteristics. An extensive review can be found elsewhere (e.g., see [1,4,5]). Such models are of use in experimental physics, bioinformatics, information retrieval, and data mining.

The simplest characteristic of a vertex in a network is its degree, the number of adjacent edges. Probably the most important and the most extensively studied property of networks is the degree distribution of their vertices. For the majority of studied real-world networks, the portion of vertices with degree $d$ was observed to decrease as $d^{-\gamma}$, usually with $\gamma > 2$ (see [3,4,8,14]). Such networks are often called scale-free.

Another important characteristic of networks is their clustering coefficient, a measure capturing the tendency of a network to form clusters, densely interconnected sets of vertices. Various definitions of the clustering coefficient can be found in the literature; see [5] for a discussion of their relationship. We define the clustering coefficient of a graph $G$ as the ratio of three times the number of triangles to the number of pairs of adjacent edges in $G$. For the majority of networks, the clustering coefficient varies in the range from 0.01 to 0.8 and does not change much as the network grows [3]. Modeling real-world networks so as to capture accurately not only their power-law degree distribution but also their clustering coefficient has been a challenge. We discuss this in detail in Section 2.
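
The definition above can be made concrete with a short sketch. It assumes a simple undirected graph given as a list of vertex pairs; the function name `global_clustering` is hypothetical, and loops and repeated edges are ignored here, which is a simplification rather than part of the definition.

```python
from itertools import combinations

def global_clustering(edges):
    """Return 3 * (#triangles) / (#pairs of adjacent edges) for a simple graph."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # loops are ignored in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Pairs of adjacent edges, counted at their common (middle) vertex.
    p2 = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    # Every triangle is found once from each of its three vertices,
    # so this sum already equals 3 * (#triangles).
    closed = sum(1 for nb in adj.values()
                 for a, b in combinations(nb, 2) if b in adj[a])
    return closed / p2 if p2 else 0.0
```

For a triangle the function returns 1, and for a path on three vertices it returns 0.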

In order to combine tunable degree distribution and clustering in one model, some authors [20,21,22] proposed to start with a concrete prior distribution of vertex degrees and clustering and then generate a random graph under such constraints. However, adjusting a model to a particular graph seems not to be generic enough and may be suspected of "overfitting". A more natural approach is to consider a graph as the result of a random process defined by certain reasonable, realistic rules guaranteeing the desired properties observed in real networks. Perhaps the most widely studied realization of this approach is preferential attachment. In Section 2 we give a background on previous studies in this field.

In this paper, we propose a new class of preferential attachment random graph models, thus generalizing some previous approaches. We provide a theoretical study, proving the power law for the degree distribution and approximating the clustering coefficient of the resulting graphs. We also propose an efficient algorithm realizing a concrete flexible model from our class in which both the power-law exponent and the clustering coefficient are tunable. All our theoretical results are verified experimentally using our algorithm. Moreover, observations of generated graphs lead to future questions and hypotheses not yet justified by theory.

The remainder of the paper is organized as follows. In Section 2 we give a background on previous studies of preferential attachment models. In Section 3 we propose a definition of a new class of models and obtain some general results for all models in this class. Then, in Section 4 we describe one particular model from the proposed class and provide an efficient algorithm that generates graphs in this model. In Section 5 we demonstrate results obtained by generating graphs in this model. Section 6 concludes the paper.

2 Preferential Attachment Random Graph Models 

In 1999, Barabási and Albert observed [3] that the degree distribution of the World Wide Web follows the power law with the exponent approximated by $-2.1$. As a possible explanation for this phenomenon, they proposed a stochastic graph construction process, which is a Markov chain of graphs governed by preferential attachment. At each time step in the process, a new vertex is added to the graph and is joined to $m$ different vertices already existing in the graph, chosen with probabilities proportional to their degrees.

Denote by $d_i^n$ the degree of the vertex $i$ in the growing graph at time $n$. Since $m$ edges are added at each step, we have $\sum_i d_i^n = 2mn$. This observation together with the preferential attachment rule implies that

$$\mathrm{P}\left(d_i^{n+1} = d + 1 \mid d_i^n = d\right) = \frac{d}{2n}. \tag{1}$$

Note that the condition (1) on the attachment probability does not specify the distribution of the $m$ vertices to be joined to, in particular their dependence. Therefore, it would be more accurate to say that Barabási and Albert proposed not a single model, but a class of models. As was shown later, there is a whole range of models that fit the Barabási-Albert description but exhibit very different behavior.
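
To illustrate how much freedom the description leaves, here is a minimal sketch of one possible member of this class. The choices below are assumptions made for illustration only: the $m$ targets are drawn independently (so repeated targets and multiple edges may occur, unlike "m different vertices"), the seed graph is two vertices joined by $m$ parallel edges, and the name `ba_like_edges` is hypothetical.

```python
import random

def ba_like_edges(n, m, seed=None):
    """Grow a multigraph on vertices 1..n; each new vertex sends m edges to
    existing vertices drawn independently with probability proportional to degree."""
    rng = random.Random(seed)
    edges = [(2, 1)] * m      # assumed seed graph: vertices 1 and 2, m parallel edges
    endpoints = [2, 1] * m    # a vertex appears here once per unit of its degree
    for new in range(3, n + 1):
        # A uniform draw from `endpoints` is a degree-proportional draw of a vertex.
        targets = [rng.choice(endpoints) for _ in range(m)]
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges
```

Drawing all targets before updating `endpoints` keeps the attachment probabilities tied to the graph as it was before the new vertex arrived.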

Theorem 1 (Bollobás, Riordan [5]). Let $f(n)$, $n \ge 2$, be any integer valued function with $f(2) = 0$ and $f(n) \le f(n+1) \le f(n) + 1$ for every $n \ge 2$, such that $f(n) \to \infty$ as $n \to \infty$. Then there is a random graph process $T(n)$ satisfying (1) such that, with probability 1, $T(n)$ has exactly $f(n)$ triangles for all sufficiently large $n$.

In [6], Bollobás and Riordan proposed a concrete, precisely defined model of the Barabási-Albert type, known as the LCD-model, and proved that for $d < n^{1/15}$ the portion of vertices with degree $d$ asymptotically almost surely obeys the power law with the exponent $-3$. Recently Grechnikov substantially improved this result [15] and removed the restriction on $d$. It was also shown that the expectation of the clustering coefficient in the model is asymptotically proportional to $\frac{(\log n)^2}{n}$ and therefore tends to zero as the graph grows [5].

One obtains a natural generalization of the Barabási-Albert model by requiring the probability of attachment of the new vertex $n+1$ to the vertex $i$ to be proportional to $d_i^n + \beta$, where $\beta$ is a constant representing the initial attractiveness of a vertex. Buckley and Osthus [9] proposed a precisely defined model with a positive integer $\beta$. Móri [17] generalized this model to real $\beta > -1$. For both models, the degree distribution was shown to follow the power law with the exponent $-(3+\beta)$ in the range of small degrees. The recent result of Eggemann and Noble [13] implies that the expectation of the clustering coefficient in the Móri model with $\beta > 0$ is asymptotically proportional to $\frac{\log n}{n}$. For $\beta = 0$, the Móri model is almost identical to the LCD-model. Therefore the authors of [13] emphasize the confusing difference between the clustering coefficients ($\frac{(\log n)^2}{n}$ versus $\frac{\log n}{n}$).

The main drawback of the described preferential attachment models is the unrealistic behaviour of the clustering coefficient. In fact, for all these models the clustering coefficient tends to zero as the graph grows, while in real-world networks the clustering coefficient is approximately constant [3].

A model with an asymptotically constant clustering coefficient was proposed by Holme and Kim [16]. However, experiments and empirical analysis show that the degree distribution in this model obeys the power law with a fixed exponent close to $-3$, which does not suit most real networks.

In the next section, we propose a new class of preferential attachment models with the power-law degree distribution. We then consider a particular model in this class, which allows one to tune both the power-law exponent and the clustering coefficient by varying its parameters.
3 Theoretical Results



In this section, we define a general class of preferential attachment models. For 
all models in this class we are able to prove the power-law degree distribution. 
We also estimate the number of pairs of adjacent edges in models from this 
class and therefore can analyze the behavior of the clustering coefficient as the 
network grows. 



3.1 Definition of the PA-class 

Let $G_m^n$ ($n \ge n_0$) be a graph with $n$ vertices $\{1, \ldots, n\}$ and $mn$ edges obtained as a result of the following random graph process. We start at the time $n_0$ from an arbitrary graph $G_m^{n_0}$ with $n_0$ vertices and $mn_0$ edges. On the $(n+1)$-th step ($n \ge n_0$), we make the graph $G_m^{n+1}$ from $G_m^n$ by adding a new vertex $n+1$ and $m$ edges connecting this vertex to some $m$ vertices from the set $\{1, \ldots, n, n+1\}$. Denote by $d_i^n$ the degree of a vertex $i$ in $G_m^n$. If for some constants $A$ and $B$ the following conditions are satisfied

$$\mathrm{P}\left(d_i^{n+1} = d_i^n \mid G_m^n\right) = 1 - A\frac{d_i^n}{n} - B\frac{1}{n} + O\left(\frac{(d_i^n)^2}{n^2}\right), \tag{2}$$

$$\mathrm{P}\left(d_i^{n+1} = d_i^n + 1 \mid G_m^n\right) = A\frac{d_i^n}{n} + B\frac{1}{n} + O\left(\frac{(d_i^n)^2}{n^2}\right), \tag{3}$$

$$\mathrm{P}\left(d_i^{n+1} = d_i^n + j \mid G_m^n\right) = O\left(\frac{(d_i^n)^2}{n^2}\right), \quad 2 \le j \le m, \tag{4}$$

$$\mathrm{P}\left(d_{n+1}^{n+1} = m + j\right) = O\left(\frac{1}{n}\right), \quad 1 \le j \le m, \tag{5}$$

then we say that the random graph process $G_m^n$ is a model from the PA-class. Condition (5) means that the probability of having a loop at the new vertex is small.

Since we add $m$ edges at each step, we obtain $2mA + B = m$ (summing up the equality (3) over all the vertices). Furthermore, we have $0 \le A \le 1$, since the probabilities defined in (2)-(5) must be positive for all $d_i^n \ge m$ and all $n$.



Here we want to emphasize that we have indeed defined not a single model but a class of models. In fact, there is a range of models possessing very different properties and satisfying the conditions (2)-(5). For example, the LCD-model belongs to the PA-class with $A = 1/2$ and $B = 0$. The Buckley-Osthus model also belongs to the PA-class, with $A = \frac{1}{2+\beta}$ and $B = \frac{m\beta}{2+\beta}$. Another example is considered in detail in Sections 4 and 5. This situation is somewhat similar to that with the definition of the Barabási-Albert models, though our class is wider in the sense that the exponent of the power-law degree distribution is tunable.

In the mathematical analysis of network models, there is a tendency to consider only fully defined models. In contrast, in the next two subsections we provide results on general properties of the whole PA-class.
3.2 Power Law Degree Distribution

Although the precise distribution of the vertices to be joined to is not fixed in the PA-class, we are still able to obtain the following results on the degree distribution.

First, we estimate $N_n(d)$, the number of vertices with given degree $d$ in $G_m^n$. Denote by $\theta(X)$ an arbitrary function such that $|\theta(X)| < X$. We prove the following result on the expectation $\mathrm{E}N_n(d)$ of $N_n(d)$.

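Theorem 2. Fix $m$, $A$, $B$ and put $\delta = 2 + \frac{1}{A}$. There exists a constant $C > 0$ such that for any $d \ge m$ we have $\mathrm{E}N_n(d) = c(m,d)\left(n + \theta\left(Cd^{\delta}\right)\right)$, where

$$c(m,d) = \frac{\Gamma\left(d + \frac{B}{A}\right)\Gamma\left(m + \frac{B+1}{A}\right)}{A\,\Gamma\left(d + \frac{B+A+1}{A}\right)\Gamma\left(m + \frac{B}{A}\right)} \sim \frac{\Gamma\left(m + \frac{B+1}{A}\right) d^{-1-\frac{1}{A}}}{A\,\Gamma\left(m + \frac{B}{A}\right)}$$

and $\Gamma(x)$ is the gamma function.

Secondly, we show that the number of vertices with given degree $d$ is highly concentrated around its expectation.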

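Theorem 3. For any $\delta > 0$ there exists a function $\varphi(n) = o(n)$ such that for any $m \le d \le n^{\frac{A-\delta}{4A+2}}$

$$\lim_{n \to \infty} \mathrm{P}\left(\left|N_n(d) - \mathrm{E}N_n(d)\right| \ge \frac{\varphi(n)}{d^{1+\frac{1}{A}}}\right) = 0.$$

These two results mean that the degree distribution follows (asymptotically) the power law with the parameter $-1 - \frac{1}{A}$.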

To prove Theorem 2 we use induction on $d$ and $n$. We use the Azuma-Hoeffding inequality to prove concentration. The complete proofs of these theorems are given in the Appendix due to space constraints.
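
As a numerical illustration of Theorem 2, the sketch below evaluates the coefficient $c(m,d)$, read here as $\Gamma(d + B/A)\,\Gamma(m + (B+1)/A) \,/\, \big(A\,\Gamma(d + (B+A+1)/A)\,\Gamma(m + B/A)\big)$, through log-gamma functions. The function names are hypothetical, and the code assumes $0 < A < 1$, $d \ge m$ and $B = m(1 - 2A)$, so that all gamma arguments are positive.

```python
from math import exp, lgamma

def power_law_exponent(A):
    """Exponent gamma = 1 + 1/A of the degree distribution (0 < A <= 1)."""
    return 1.0 + 1.0 / A

def c_coeff(m, d, A, B):
    """Evaluate c(m, d) via log-gamma to avoid overflow for large d."""
    log_c = (lgamma(d + B / A) + lgamma(m + (B + 1) / A)
             - lgamma(d + (B + A + 1) / A) - lgamma(m + B / A))
    return exp(log_c) / A
```

Two quick sanity checks under this reading: $c(m,m) = 1/(Am + B + 1)$, and consecutive values satisfy $c(m,d)/c(m,d-1) = (A(d-1)+B)/(Ad+B+1)$, which is the recurrence used in the induction step of the proof.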



3.3 Clustering Coefficient 

Here we consider the clustering in the PA-class. Results for some classical preferential attachment models (LCD and Móri) can be found in Section 2. In both models the clustering coefficient tends to zero as the graph grows.

Usually the asymptotic value of the clustering coefficient can be estimated by taking three times the quotient of the expected number of triangles and the expected number of $P_2$'s (pairs of adjacent edges), since one can prove that both random variables are highly concentrated around their expectations.

In particular, for the LCD-model, the expected number of triangles turns out to be of order $(\log n)^3$ and the expected number of $P_2$'s is of order $n \log n$. In the Móri model with $\beta > 0$, these quantities are of order $\log n$ and $n$ respectively.

We see that for the two different models the results are quite different. Here we generalize the mentioned results. First, we study the random variable $P_2(n)$ equal to the number of $P_2$'s in a random graph $G_m^n$ from an arbitrary model that belongs to the PA-class. In the theorems below, we use the following notation. By whp ("with high probability") we mean that for some sequence $A_n$ of events, $\mathrm{P}(A_n) \to 1$ as $n \to \infty$. We write $a_n \sim b_n$ if $a_n = (1 + o(1))b_n$, and we write $a_n \propto b_n$ if $C_0 b_n \le a_n \le C_1 b_n$ for some constants $C_0, C_1 > 0$.
Theorem 4. For any model from the PA-class, we have

(1) if $2A < 1$, then whp $P_2(n) \sim \left(2m(A+B) + \frac{m(m-1)}{2}\right)\frac{n}{1-2A}$;
(2) if $2A = 1$, then whp $P_2(n) \sim \left(2m(A+B) + \frac{m(m-1)}{2}\right) n\log(n)$;
(3) if $2A > 1$, then whp $P_2(n) \propto n^{2A}$.


The ideas behind the proof of Theorem 4 are given in the Appendix. Here it is worth noting that the value of $P_2(n)$ in scale-free graphs is usually determined by the power-law exponent $\gamma$. Indeed, we have

$$P_2(n) = \sum_{d=1}^{d_{\max}} N_n(d)\frac{d(d-1)}{2} \propto \sum_{d=1}^{d_{\max}} n\,d^{2-\gamma},$$

where $d_{\max}$ is the maximum degree. Therefore if $\gamma > 3$, then $P_2(n)$ is linear in $n$. But if $\gamma \le 3$, then $P_2(n)$ is superlinear.

Next, we study the random variable $T(n)$ equal to the number of triangles in $G_m^n$. Note that in any model from the PA-class we have $T(n) = O(n)$, since at each step we add at most $\frac{m(m-1)}{2}$ triangles. If we combine this fact with the previous observation, we see that if $\gamma \le 3$, then in any preferential attachment model (in which the out-degree of each vertex equals $m$) the clustering coefficient tends to zero as $n$ grows.

Our aim is to find models with constant clustering coefficient. Let us consider a subclass of the PA-class with the following property:

$$\mathrm{P}\left(d_i^{n+1} = d_i^n + 1,\ d_j^{n+1} = d_j^n + 1 \mid G_m^n\right) = e_{ij}\frac{D}{mn} + O\left(\frac{d_i^n d_j^n}{n^2}\right). \tag{6}$$

Here $e_{ij}$ is the number of edges between vertices $i$ and $j$ in $G_m^n$.

Theorem 5. Let $G_m^n$ satisfy the condition (6). Then whp $T(n) \sim Dn$.

The proof of this theorem is straightforward. The expected number of triangles added at each step is $D + o(1)$. Therefore $\mathrm{E}T(n) = Dn + o(n)$. The Azuma-Hoeffding inequality can be used to prove concentration.

As a consequence of Theorems 4 and 5, we get the following result on the clustering coefficient $C(n)$ of the graph $G_m^n$.

Theorem 6. Let $G_m^n$ belong to the PA-class and satisfy the condition (6). Then

(1) if $2A < 1$, then whp $C(n) \sim \frac{3(1-2A)D}{m\left(2(A+B) + \frac{m-1}{2}\right)}$;
(2) if $2A = 1$, then whp $C(n) \sim \frac{3D}{m\left(2(A+B) + \frac{m-1}{2}\right)\log n}$;
(3) if $2A > 1$, then whp $C(n) \propto n^{1-2A}$.
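
The three regimes can be read off from the definition $C(n) = 3T(n)/P_2(n)$ by substituting the asymptotics of $T(n)$ and $P_2(n)$. The worked step below is a sketch of that substitution for the case $2A < 1$, using $T(n) \sim Dn$ and $P_2(n) \sim \left(2m(A+B) + \frac{m(m-1)}{2}\right)\frac{n}{1-2A}$; the other two cases follow in the same way with $n \log n$ and $n^{2A}$ in place of $\frac{n}{1-2A}$.

```latex
C(n) = \frac{3\,T(n)}{P_2(n)}
\sim \frac{3Dn}{\left(2m(A+B) + \frac{m(m-1)}{2}\right)\frac{n}{1-2A}}
= \frac{3(1-2A)D}{m\left(2(A+B) + \frac{m-1}{2}\right)} .
```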

In the next section we propose a concrete flexible model from the PA-class 
and provide an efficient algorithm which generates graphs in this model. 
4 Polynomial Model



In this section, we consider polynomial random graph models, which belong to the general PA-class defined above. In Subsection 4.1 we give the definitions. In Subsection 4.2 we propose an efficient algorithm that generates polynomial graphs. In Subsection 4.3 we find the relations between the parameters of the polynomial model and those of the PA-class. Applying our theoretical results to polynomial models, we find them to be very flexible: one can tune the parameter of the degree distribution, the clustering coefficient, and other characteristics.



4.1 Definition of Polynomial Model 

Let us define the polynomial model. As in the random graph process from Subsection 3.1, we construct a graph $G_m^n$ step by step. On the $(n+1)$-th step the graph $G_m^{n+1}$ is made from the graph $G_m^n$ by adding a new vertex $n+1$ and edges $e_1, \ldots, e_m$ connecting this vertex to some $m$ vertices $i_1, \ldots, i_m \in \{1, \ldots, n+1\}$. Some of $i_1, \ldots, i_m$ can be equal, so multiple edges are permitted. Also, if $i_j = n+1$ for some $j$, then we obtain a loop (multiple loops are also permitted).

We say that an edge $ij$ is directed from $i$ to $j$ if $i \ge j$, so the out-degree of each vertex equals $m$. We also say that $j$ is the target end of $ij$. Denote by $(d_i^n)_{\mathrm{in}}$ the in-degree of a vertex $i$ in $G_m^n$. Recall that $e_{ij}$ denotes the number of edges between vertices $i$ and $j$.

Fix some $k$, $l$ such that $0 \le k \le m/2$ and $2k \le l \le m$. Put

$$M(n, m, i_1, \ldots, i_m, k, l) = \frac{1}{(n+1)^{m-l}} \prod_{x=1}^{k} \frac{e_{i_{2x} i_{2x-1}}}{mn} \prod_{y=2k+1}^{l} \frac{(d_{i_y}^n)_{\mathrm{in}}}{mn}.$$

This is a monomial depending on the in-degrees $(d_{i_y}^n)_{\mathrm{in}}$ and on $e_{i_{2x} i_{2x-1}}$. It is easy to see that for each monomial we have $\sum_{i_1=1}^{n+1} \cdots \sum_{i_m=1}^{n+1} M(n, m, i_1, \ldots, i_m, k, l) = 1$.

Consider any nonnegative $\alpha_{k,l}$ such that $\sum_{k=0}^{m/2} \sum_{l=2k}^{m} \alpha_{k,l} = 1$. Define the polynomial $\sum_{k=0}^{m/2} \sum_{l=2k}^{m} \alpha_{k,l} M(n, m, i_1, \ldots, i_m, k, l)$. Note that the sum of all values of this polynomial over all $i_1, \ldots, i_m$ equals 1. Therefore we can put

$$\mathrm{P}\left(\text{edges } \{e_1, \ldots, e_m\} \text{ go to vertices } \{i_1, \ldots, i_m\} \text{ respectively}\right) = \sum_{k=0}^{m/2} \sum_{l=2k}^{m} \alpha_{k,l} M(n, m, i_1, \ldots, i_m, k, l). \tag{7}$$

This random graph process defines the polynomial model. This model belongs to the PA-class. Indeed, one can formally show by simple calculations that the conditions (2)-(5) hold for this model. We consider the construction of the polynomial model in detail in the following two subsections.

Many models are special cases of the polynomial model. If we take the polynomial $\prod_{y=1}^{m} \frac{d_{i_y}^n}{2mn}$, then we obtain a model that is practically identical to the LCD-model. The Buckley-Osthus model can also be interpreted in terms of the polynomial model.
4.2 Generating Algorithm

Let us present an algorithm to efficiently generate graphs in the polynomial model. When we add the $(n+1)$-th vertex we do the following:

Step 1 With probabilities $\{\alpha_{k,l}\}$ we choose some $k = k_0$ and $l = l_0$.

Step 2 We choose $k_0$ edges from the existing graph $G_m^n$ uniformly and independently and take all the ends of these edges as vertices $\{i_1, \ldots, i_{2k_0}\}$.

Step 3 We choose $(l_0 - 2k_0)$ edges from the existing graph $G_m^n$ uniformly and independently and take the target ends of these edges as vertices $\{i_{2k_0+1}, \ldots, i_{l_0}\}$.

Step 4 Vertices $\{i_{l_0+1}, \ldots, i_m\}$ are chosen randomly, mutually independently, and with equal probabilities from the vertices $\{1, \ldots, n+1\}$.
(Note that one vertex can be chosen several times at Steps 2-4.)

Step 5 We construct $G_m^{n+1}$ by adding to $G_m^n$ a new vertex $n+1$ with $m$ edges going to the vertices $\{i_1, \ldots, i_m\}$.

Now it is clear why the monomials from Section 4.1 can be interpreted as the probabilities of certain events. Indeed, the first factor in the definition of $M(n, m, i_1, \ldots, i_m, k, l)$ corresponds to Step 4 of the algorithm, the second factor corresponds to Step 2, and the third factor corresponds to Step 3.

Note that Step 2 of our algorithm is mainly responsible for the formation of triangles in our random graph. In [16], Holme and Kim propose another procedure for producing triangles. Namely, they choose edges to form triangles with probabilities proportional to the degree of the edge head. One can show that this fact pushes the Holme-Kim model out of the PA-class.

Our algorithm allows us to generate graphs in the polynomial model with $O(n)$ complexity. The estimate of the complexity is straightforward, assuming that we can choose an arbitrary edge or vertex in $O(1)$ time. To this end, we maintain an array of edges and an array of vertices at each step. Thus, in order to choose an arbitrary edge or vertex, we first take a random index and then take the element with this index from the corresponding array.
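
Steps 1-5 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function name `generate_polynomial_graph`, the dictionary format for the weights $\alpha_{k,l}$, and the seed graph (a single vertex with $m$ loops) are all choices made here. Edges are stored as pairs (source, target) with the newer vertex first, so the target end is the second component.

```python
import random

def generate_polynomial_graph(n, m, alpha, seed=None):
    """Return the edge list of a polynomial-model graph on vertices 1..n.

    alpha maps (k, l) -> probability, with 0 <= k <= m/2 and 2k <= l <= m.
    """
    rng = random.Random(seed)
    pairs = list(alpha)
    weights = [alpha[p] for p in pairs]
    edges = [(1, 1)] * m                  # assumed seed graph: vertex 1 with m loops
    for new in range(2, n + 1):
        k, l = rng.choices(pairs, weights=weights)[0]   # Step 1
        targets = []
        for _ in range(k):                              # Step 2: both ends of an edge
            targets.extend(rng.choice(edges))
        for _ in range(l - 2 * k):                      # Step 3: target end of an edge
            targets.append(rng.choice(edges)[1])
        for _ in range(m - l):                          # Step 4: uniform vertex, loops allowed
            targets.append(rng.randint(1, new))
        edges.extend((new, t) for t in targets)         # Step 5: add the m new edges
    return edges
```

Since the edge list is only extended in Step 5, all draws in Steps 2 and 3 refer to the graph before the new vertex was added, and each step costs $O(m)$ list operations.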

4.3 Properties 

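It is easy to check that the parameters $\alpha_{k,l}$ from (7) and $A$ from (2) are related in the following way:

$$A = \sum_{k=0}^{m/2} \sum_{l=2k}^{m} \alpha_{k,l}\,\frac{l-k}{m}. \tag{8}$$

This means that we can get any value of $A$ in $[0, 1]$ and any power-law exponent $\gamma \in (2, \infty)$. We also obtain $D = \sum_{k=0}^{m/2} \sum_{l=2k}^{m} k\,\alpha_{k,l}$.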
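
The following sketch turns a table of weights $\alpha_{k,l}$ into the PA-class parameters. It assumes the relations $A = \sum_{k,l} \alpha_{k,l}\,(l-k)/m$ and $D = \sum_{k,l} k\,\alpha_{k,l}$; the first is a reading obtained by counting the expected degree increment in Steps 2-4 of the algorithm and should be checked against the original formula. It also uses $2mA + B = m$ and $\gamma = 1 + 1/A$. The function name `pa_parameters` is hypothetical.

```python
def pa_parameters(m, alpha):
    """Return (A, B, D, gamma) for weights alpha[(k, l)] that sum to 1."""
    A = sum(a * (l - k) / m for (k, l), a in alpha.items())
    D = sum(a * k for (k, l), a in alpha.items())
    B = m * (1 - 2 * A)                           # from 2mA + B = m
    gamma = 1 + 1 / A if A > 0 else float("inf")  # exponent of the power law
    return A, B, D, gamma
```

For example, with $m = 2$ and weights 0.2 on $(k,l) = (0,2)$, 0.4 on $(1,2)$ and 0.4 on $(0,0)$, this gives $A = 0.4$, $B = 0.4$, $D = 0.4$.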

In the next section we analyze some properties of graphs in the polynomial model. We generate polynomial graphs and compare their properties with the theoretical results obtained above.
5 Experiments



In this section, we choose a three-parameter model from the family of polynomial graph models defined in Section 4 and analyze the properties of the generated graphs depending on the parameters.



5.1 Description of Empirically Studied Polynomial Model 

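We studied empirically graphs in the polynomial model with $m = 2p$ and the polynomial

$$\prod_{k=1}^{p} \left( \alpha\,\frac{(d_{i_{2k}}^n)_{\mathrm{in}}\,(d_{i_{2k-1}}^n)_{\mathrm{in}}}{(mn)^2} + \beta\,\frac{e_{i_{2k} i_{2k-1}}}{mn} + \frac{\delta}{(n+1)^2} \right).$$

Here we need $\alpha, \beta, \delta \ge 0$ and $\alpha + \beta + \delta = 1$; therefore, we have three independent model parameters: $m$, $\alpha$, and $\beta$.

From (8) we obtain that in this model $A = \alpha + \frac{\beta}{2}$ and $B = m(\delta - \alpha)$; therefore, due to Theorem 2, the parameter of the degree distribution equals $\gamma = 1 + \frac{2}{2\alpha + \beta}$.

For $D$ from (6) we have $D = p\beta = \frac{m\beta}{2}$. Using Theorem 6 we get

$$C(n) \sim \frac{3(1 - 2\alpha - \beta)\beta}{5m - 1 - 2(2m-1)(2\alpha + \beta)}. \tag{9}$$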
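
The parameter relations of this subsection are easy to check mechanically. The sketch below assumes $A = \alpha + \beta/2$, $B = m(\delta - \alpha)$, $D = m\beta/2$, and, for $2A < 1$, the limit $C = 3(1-2A)D / \left(2m(A+B) + m(m-1)/2\right)$. It uses exact rational arithmetic so that the result can be compared with the closed form in $\alpha$, $\beta$ and $m$. The function name `three_parameter_summary` is hypothetical, and parameters are passed as strings or fractions to avoid floating-point rounding.

```python
from fractions import Fraction

def three_parameter_summary(m, alpha, beta):
    """Exact A, B, D and the limiting clustering coefficient C."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    delta = 1 - alpha - beta
    A = alpha + beta / 2
    B = m * (delta - alpha)
    D = Fraction(m, 2) * beta
    if 2 * A < 1:
        C = 3 * (1 - 2 * A) * D / (2 * m * (A + B) + Fraction(m * (m - 1), 2))
    else:
        C = Fraction(0)   # for 2A >= 1 the clustering coefficient tends to zero
    return {"A": A, "B": B, "D": D, "C": C}
```

For $m = 2$, $\alpha = 0$, $\beta = 0.4$ the call `three_parameter_summary(2, "0", "0.4")` yields $A = 1/5$ and $C = 6/55 \approx 0.109$.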

5.2 Empirical Results 

Degree Distribution and Clustering Coefficient. We studied two polynomial graphs with $n = 10^7$, $m = 2$, and $A = 0.2$, putting $\alpha = 0.2$, $\beta = 0$ for the first graph and $\alpha = 0$, $\beta = 0.4$ for the second one. The observed degree distributions are almost identical and follow the power law with the expected parameter $\gamma = 3.5$, see Fig. 1 a).

For both cases, we also studied the behaviour of the clustering coefficient of generated graphs, with 40 samples for each $n = \lfloor 10^{1 + 0.06 i} \rfloor$, $i = 0, \ldots, 100$; see Fig. 1 b). In the first case we observe $C(n) \to 0$ and in the second one $C(n) \to \frac{6}{55}$, as was expected due to (9).

Fig. 1. a) The degree distribution of polynomial graphs with $n = 10^7$ and $m = 2$; b) The clustering coefficient of polynomial graphs with $m = 2$ depending on $n$.

Assortativity. In this section, we consider so-called assortative mixing (or degree correlations) in the polynomial model (see, e.g., [4,12]). One of the possible definitions is the following. For an undirected graph $G$, consider the average degree of the nearest neighbors of vertices with a given degree $d$:

$$d_{nn}(d) = \frac{1}{d\,N(d)} \sum_{i:\,d(i) = d}\ \sum_{j:\,ij \in E(G)} d(j),$$

where $E(G)$ is the set of edges of the graph $G$, $d(i)$ is the degree of the vertex $i$, and $N(d)$ is the number of vertices of degree $d$. If vertices of high degree tend to connect with vertices of high degree in a network, then $d_{nn}(d)$ is an increasing function (the case of an assortative network). Vice versa, if vertices of high degree tend to connect with vertices of low degree, then $d_{nn}(d)$ decreases (the disassortative case).
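
The quantity $d_{nn}(d)$ can be computed in one pass over the edge list. The sketch below assumes an undirected multigraph given as a list of vertex pairs and reads the definition as the sum of neighbor degrees over all vertices of degree $d$, divided by $d$ times the number of such vertices; the function name `average_neighbor_degree` is hypothetical. A loop contributes 2 to the degree of its vertex and is counted as two neighbor slots.

```python
from collections import defaultdict

def average_neighbor_degree(edges):
    """Return {d: d_nn(d)} for every degree d that occurs in the graph."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    total = defaultdict(int)   # sum of neighbor degrees over vertices of degree d
    for u, v in edges:
        total[deg[u]] += deg[v]
        total[deg[v]] += deg[u]
    count = defaultdict(int)   # N(d), the number of vertices of degree d
    for d in deg.values():
        count[d] += 1
    return {d: total[d] / (d * count[d]) for d in count}
```

On a star with three leaves, the leaves (degree 1) have $d_{nn}(1) = 3$ and the center has $d_{nn}(3) = 1$, a small disassortative example.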

As was mentioned in [18] and [19], in the Barabási-Albert model $d_{nn}(d) \approx \mathrm{const}$. In real-world networks, $d_{nn}(d) \sim d^{\delta}$ with some $\delta$ (see [18]). The Internet and the WWW are disassortative networks ([H]) and social networks are usually assortative. Despite the results of [18] on the assortativity of preferential attachment models, our experiments show that even the Buckley-Osthus model may possess assortativity (for $A < 1/2$) or disassortativity (for $A > 1/2$).

Fig. 2 a) shows the assortativity of the polynomial model with $\alpha = 0.2$, $\beta = 0$ and with $\alpha = 0$, $\beta = 0.4$. In both cases $A = 0.2$ and we obtain $\delta \approx 0.41$. Fig. 2 b) shows the disassortativity of the polynomial model with $\alpha = 0.8$, $\beta = 0$ and with $\alpha = 0.6$, $\beta = 0.4$. In both cases $A = 0.8$ and we obtain $\delta \approx -0.8$.



Comparison with Other Models. The following table summarizes our results for the polynomial model in comparison with other mentioned preferential attachment models:

Model        Exponent γ   Clustering coefficient   Assortativity
BA           3            tends to zero            no assortativity
BO/Móri      (2, ∞)       tends to zero            assortative for β > 0, disassortative for β < 0
HK           3            constant                 ?
Polynomial   (2, ∞)       constant for A < 1/2     assortative for A < 1/2, disassortative for A > 1/2

6 Conclusions



In this paper, we introduced the PA-class of random graph models, which generalizes previous preferential attachment approaches. We proved that any model from the PA-class possesses the power-law degree distribution with a tunable exponent. We also estimated its clustering coefficient. Next, we described one particular model from the proposed class (in which both the degree distribution parameter and the clustering coefficient are tunable) and provided a linear-time algorithm that generates graphs in this model. Experiments with generated graphs verify our theoretical results. Moreover, the study of assortative mixing in generated graphs leads to future questions and hypotheses not yet justified by theory.

While the degree distribution of a preferential attachment model can be adjusted to reality, the clustering coefficient still poses a problem in some cases. For most real-world networks the parameter $\gamma$ of their degree distribution belongs to $[2, 3]$. As we showed in Section 3, once $\gamma \le 3$ in a preferential attachment model, the clustering coefficient decreases as the graph grows, which does not correspond to the majority of real-world networks.

Fortunately, there are many ways to overcome this obstacle. Cooper and Prałat proposed a modification of the Barabási-Albert model in which a new vertex added at time $t$ generates $t^c$ edges [11]. Preferential attachment models with random initial degrees have also been considered. There are also models in which edges are added between already existing nodes (e.g., [10]). Using one of these ideas for the PA-class is a topic for future research.
Appendix: Proofs



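Proof of Theorem 2

We need the following notation:

$$\mathrm{P}\left(d_i^{n+1} = d \mid d_i^n = d\right) = 1 - A\frac{d}{n} - B\frac{1}{n} + O\left(\frac{d^2}{n^2}\right), \tag{10}$$

$$p_n^1(d) := \mathrm{P}\left(d_i^{n+1} = d + 1 \mid d_i^n = d\right) = A\frac{d}{n} + B\frac{1}{n} + O\left(\frac{d^2}{n^2}\right), \tag{11}$$

$$p_n^j(d) := \mathrm{P}\left(d_i^{n+1} = d + j \mid d_i^n = d\right) = O\left(\frac{d^2}{n^2}\right), \quad 2 \le j \le m, \tag{12}$$

$$p_n := \sum_{k=1}^{m} \mathrm{P}\left(d_{n+1}^{n+1} = m + k\right) = O\left(\frac{1}{n}\right). \tag{13}$$

Note that the remainder term of $p_n^j(d)$ can depend on $i$. We omit $i$ in the notation $p_n^j(d)$ for simplicity of proofs.

Put $p_i(d) = \sum_{j=1}^{m} p_i^j(d)$. Note that $\frac{i}{A(d-1)+B}\,p_i^1(d-1) = 1 + O\left(\frac{d}{i}\right)$. We use this equality several times in this proof.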

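The proof is by induction on $d$ and then on $i$. We use the following equalities:

$$\mathrm{E}\left(N_{i+1}(m) \mid N_i(m)\right) = N_i(m)\left(1 - p_i(m)\right) + 1 - p_i, \tag{14}$$

$$\mathrm{E}\left(N_{i+1}(d) \mid N_i(d), N_i(d-1), \ldots, N_i(d-m)\right) = N_i(d)\left(1 - p_i(d)\right) + N_i(d-1)\,p_i^1(d-1) + \sum_{j=2}^{m} N_i(d-j)\,p_i^j(d-j) + O(p_i). \tag{15}$$

Consider the case $d = m$. For a constant number of small $i$ we obviously have $\mathrm{E}N_i(m) = \frac{i}{Am+B+1} + \theta(C_1)$ with some $C_1$. Assume that $\mathrm{E}N_i(m) = \frac{i}{Am+B+1} + \theta(C_1)$. From (14) we obtain

$$\begin{aligned}
\mathrm{E}N_{i+1}(m) &= \mathrm{E}N_i(m)\left(1 - p_i(m)\right) + 1 - p_i \\
&= \left(\frac{i}{Am+B+1} + \theta(C_1)\right)\left(1 - p_i(m)\right) + 1 + \theta\left(\frac{C_2}{i}\right) \\
&= \frac{i+1}{Am+B+1} + \theta(C_1)\left(1 - p_i(m)\right) + \theta\left(\frac{C_3}{i(Am+B+1)}\right) + \theta\left(\frac{C_2}{i}\right).
\end{aligned}$$

It remains to show that

$$C_1\,p_i(m) \ge \frac{C_3}{i(Am+B+1)} + \frac{C_2}{i}.$$

For sufficiently large $i$ we have $p_i(m) \ge \frac{mA+B}{2i}$. It gives us

$$C_1\,\frac{mA+B}{2i} \ge \frac{C_3}{i(Am+B+1)} + \frac{C_2}{i}.$$

This inequality holds for big $i$ and $C_1$. This completes the proof for $d = m$.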

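Consider $d > m$ and assume that the theorem is proved for all smaller degrees. We use induction on $i$. We have $N_i(d) \le \frac{2mi}{d}$, therefore $N_i(d) = O\left(i\,c(m,d)\,d^{1/A}\right)$. In particular, for $i = O(d^2)$ we have $\mathrm{E}N_i(d) = c(m,d)\left(i + \theta\left(Cd^{\delta}\right)\right)$ with some $C$. Assume that

$$\mathrm{E}N_i(d) = c(m,d)\left(i + \theta\left(Cd^{\delta}\right)\right).$$

From (15) we obtain

$$\begin{aligned}
\mathrm{E}N_{i+1}(d) &= \mathrm{E}N_i(d)\left(1 - p_i(d)\right) + \mathrm{E}N_i(d-1)\,p_i^1(d-1) + \sum_{j=2}^{m} \mathrm{E}N_i(d-j)\,p_i^j(d-j) + O(p_i) \\
&= c(m,d)\left(i + \theta\left(Cd^{\delta}\right)\right)\left(1 - p_i(d)\right) + c(m,d-1)\left(i + \theta\left(C(d-1)^{\delta}\right)\right)p_i^1(d-1) + \theta\left(\frac{C_4\,c(m,d)\,d^{2+1/A}}{i}\right) \\
&= c(m,d)(i+1) + c(m,d-1)\,i\,p_i^1(d-1) - c(m,d)\,i\,p_i(d) - c(m,d) + c(m,d)\,\theta\left(Cd^{\delta}\right)\left(1 - p_i(d)\right) \\
&\quad + c(m,d)\,\frac{Ad+B+1}{Ad-A+B}\,\theta\left(C(d-1)^{\delta}\right)p_i^1(d-1) + \theta\left(\frac{C_4\,c(m,d)\,d^{2+1/A}}{i}\right) \\
&= c(m,d)(i+1) + c(m,d)\,\theta\left(Cd^{\delta}\right)\left(1 - p_i(d)\right) + c(m,d)\,\frac{Ad+B+1}{Ad-A+B}\,\theta\left(C(d-1)^{\delta}\right)p_i^1(d-1) + \theta\left(\frac{C_5\,c(m,d)\,d^{2+1/A}}{i}\right).
\end{aligned}$$

We need to prove that there exists a constant $C$ such that

$$Cd^{\delta}\,p_i(d) \ge C\,\frac{Ad+B+1}{Ad-A+B}\,(d-1)^{\delta}\,p_i^1(d-1) + \frac{C_5\,d^{2+1/A}}{i}.$$

It suffices that

$$Cd^{\delta}\,p_i(d) \ge C\,\frac{Ad+B+1}{Ad-A+B}\left(d^{\delta} - \delta d^{\delta-1} + C_6\,d^{\delta-2}\right)p_i^1(d-1) + \frac{C_5\,d^{2+1/A}}{i},$$

which, to leading order in $1/i$, reduces to

$$\frac{2AC\,d^{\delta}}{i} \ge \frac{C\,C_7\,d^{\delta-1}}{i} + \frac{C_5\,d^{\delta}}{i}.$$

This inequality holds for big $C$ and $d$. For a constant number of small $d$ we need to show that there exists a function $f(d) > 0$ such that

$$f(d)\,d^{\delta}\,p_i(d) \ge f(d-1)\,\frac{Ad+B+1}{Ad-A+B}\,(d-1)^{\delta}\,p_i^1(d-1) + \frac{C_5\,d^{2+1/A}}{i}.$$

Obviously this function exists. This concludes the proof.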
Proof of Theorem 3

To prove Theorem 3 we need the Azuma-Hoeffding inequality:

Theorem 7 (Azuma, Hoeffding). Let $(X_i)_{i=0}^{n}$ be a martingale such that $|X_i - X_{i-1}| \le c$ for any $1 \le i \le n$. Then

$$\mathrm{P}\left(|X_n - X_0| \ge x\right) \le 2e^{-\frac{x^2}{2c^2 n}}$$

for any $x > 0$.

Suppose we are given some $\delta > 0$. Fix $n$ and $d$: $1 \le d \le n^{\frac{A-\delta}{4A+2}}$. Consider the random variables $X_i(d) = \mathrm{E}\left(N_n(d) \mid G_m^i\right)$, $i = 0, \ldots, n$.

Let us explain the notation $\mathrm{E}\left(N_n(d) \mid G_m^i\right)$. Denote by $\mathcal{G}_m^n$ the probability space of graphs we obtain after the $n$-th step of the process. We construct the graph $G_m^n \in \mathcal{G}_m^n$ by induction. For any $t \le n$ there exists a unique $G_m^t \in \mathcal{G}_m^t$ such that $G_m^n$ is obtained from $G_m^t$. So $\mathrm{E}\left(N_n(d) \mid G_m^t\right)$ is the expectation of the number of vertices with degree $d$ in $G_m^n$ if at the step $t$ we have the graph $G_m^t$. Note that $X_0(d) = \mathrm{E}N_n(d)$ and $X_n(d) = N_n(d)$. From the definition of $\mathcal{G}_m^n$ it follows that $X_i(d)$ is a martingale.

We will prove below that for any $i = 0, \ldots, n-1$

$$|X_{i+1}(d) - X_i(d)| \le Md,$$

where $M > 0$ is some constant. Theorem 3 follows from this statement immediately. Put $c = Md$. Then from the Azuma-Hoeffding inequality it follows that

$$\mathrm{P}\left(|N_n(d) - \mathrm{E}N_n(d)| \ge d\sqrt{n}\log n\right) \le 2\exp\left(-\frac{n\,d^2\log^2 n}{2nM^2d^2}\right) = 2\exp\left(-\frac{\log^2 n}{2M^2}\right) = o(1).$$

If $d \le n^{\frac{A-\delta}{4A+2}}$, then the value of $\frac{n}{d^{1+1/A}}$ is considerably greater than $d\log n\sqrt{n}$. This is exactly what we need.

It remains to estimate the quantity $|X_{i+1}(d) - X_i(d)|$. The proof is by a direct calculation.

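Fix $0 \le i \le n-1$ and some graph $G_m^i$. Note that

$$\left|\mathrm{E}\left(N_n(d) \mid G_m^{i+1}\right) - \mathrm{E}\left(N_n(d) \mid G_m^i\right)\right| \le \max_{G_m^{i+1}}\left\{\mathrm{E}\left(N_n(d) \mid G_m^{i+1}\right)\right\} - \min_{G_m^{i+1}}\left\{\mathrm{E}\left(N_n(d) \mid G_m^{i+1}\right)\right\},$$

where the maximum and the minimum are taken over the graphs $G_m^{i+1}$ that can be obtained from $G_m^i$. Put $\hat{G}_m^{i+1} = \arg\max \mathrm{E}\left(N_n(d) \mid G_m^{i+1}\right)$ and $\tilde{G}_m^{i+1} = \arg\min \mathrm{E}\left(N_n(d) \mid G_m^{i+1}\right)$. We need to estimate the difference $\mathrm{E}\left(N_n(d) \mid \hat{G}_m^{i+1}\right) - \mathrm{E}\left(N_n(d) \mid \tilde{G}_m^{i+1}\right)$. For $i+1 \le t \le n$ put

$$\delta_t(d) = \mathrm{E}\left(N_t(d) \mid \hat{G}_m^{i+1}\right) - \mathrm{E}\left(N_t(d) \mid \tilde{G}_m^{i+1}\right).$$

First suppose that $n = i+1$. Fix $G_m^i$. The graphs $\hat{G}_m^{i+1}$ and $\tilde{G}_m^{i+1}$ are obtained from the graph $G_m^i$ by adding the vertex $i+1$ and $m$ edges. Therefore $\delta_{i+1}(d) \le 2m$.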
Now consider $t$: $i+1 \le t \le n-1$. Note that

$$\mathrm{E}\left(N_{t+1}(m) \mid G_m^i\right) = \mathrm{E}\left(N_t(m) \mid G_m^i\right)\left(1 - p_t(m)\right) + 1 + O(1/t),$$

$$\mathrm{E}\left(N_{t+1}(d) \mid G_m^i\right) = \mathrm{E}\left(N_t(d) \mid G_m^i\right)\left(1 - p_t(d)\right) + \mathrm{E}\left(N_t(d-1) \mid G_m^i\right)p_t^1(d-1) + \sum_{j=2}^{m} \mathrm{E}\left(N_t(d-j) \mid G_m^i\right)p_t^j(d-j) + O(1/t), \quad d \ge m+1.$$

We obtained the same equalities in the proof of Theorem 2. Replace $G_m^i$ by $\hat{G}_m^{i+1}$ or $\tilde{G}_m^{i+1}$ in these equalities. Subtracting the equalities with $\tilde{G}_m^{i+1}$ from the equalities with $\hat{G}_m^{i+1}$ we get (for $d > m$)

$$\begin{aligned}
\delta_{t+1}(d) &= \delta_t(d)\left(1 - p_t(d)\right) + \delta_t(d-1)\,p_t^1(d-1) + O\left(\frac{\mathrm{E}N_t(d)\,d^2}{t^2}\right) \\
&= \delta_t(d)\left(1 - p_t(d)\right) + \delta_t(d-1)\,p_t^1(d-1) + O\left(\frac{d}{t}\right).
\end{aligned}$$

From this recurrence relation it is easy to obtain that $\delta_n(d) \le Md$ for some $M$. This concludes the proof of Theorem 3.

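Proof of Theorem 4

Let us sketch the proof of Theorem 4. We can prove this theorem by induction. Note that

$$P_2(n) = \sum_{d \ge m} N_n(d)\,\frac{d(d-1)}{2}.$$

Therefore

$$\begin{aligned}
\mathrm{E}P_2(i+1) &= \sum_{d \ge m} \mathrm{E}N_{i+1}(d)\,\frac{d(d-1)}{2} = \mathrm{E}P_2(i) + \frac{m(m-1)}{2} + \sum_{d \ge m} \mathrm{E}N_i(d)\,p_i(d)\,d \\
&\sim \mathrm{E}P_2(i) + \frac{m(m-1)}{2} + \sum_{d \ge m} \frac{(Ad+B)\,d\,\mathrm{E}N_i(d)}{i} = \mathrm{E}P_2(i)\left(1 + \frac{2A}{i}\right) + 2m(A+B) + \frac{m(m-1)}{2}.
\end{aligned}$$

So we obtain

$$\mathrm{E}P_2(n) \sim \left(2m(A+B) + \frac{m(m-1)}{2}\right)\sum_{t=1}^{n}\ \prod_{i=t+1}^{n}\left(1 + \frac{2A}{i}\right) \sim \left(2m(A+B) + \frac{m(m-1)}{2}\right)\sum_{t=1}^{n} \frac{n^{2A}}{t^{2A}}.$$

If $2A < 1$ then

$$\mathrm{E}P_2(n) \sim \left(2m(A+B) + \frac{m(m-1)}{2}\right)\frac{n}{1-2A}.$$

If $2A = 1$ then

$$\mathrm{E}P_2(n) \sim \left(2m(A+B) + \frac{m(m-1)}{2}\right) n\log(n).$$

If $2A > 1$ then

$$\mathrm{E}P_2(n) = O\left(n^{2A}\right).$$

We computed the expectation of $P_2$. One can prove concentration using standard martingale methods. This completes the proof of Theorem 4.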
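
The recurrence behind this sketch of the proof can be iterated numerically to see the linear regime appear. The code below is an illustration only: it assumes the recurrence $\mathrm{E}P_2(i+1) = \mathrm{E}P_2(i)\left(1 + \frac{2A}{i}\right) + K$ with $K = 2m(A+B) + \frac{m(m-1)}{2}$, ignores the lower-order error terms, starts from $\mathrm{E}P_2(1) = 0$, and takes $B = m(1-2A)$. The function names are hypothetical.

```python
def expected_p2(n, m, A):
    """Iterate E P2(i+1) = E P2(i) * (1 + 2A/i) + K from E P2(1) = 0 up to step n."""
    B = m * (1 - 2 * A)                     # from 2mA + B = m
    K = 2 * m * (A + B) + m * (m - 1) / 2
    p2 = 0.0
    for i in range(1, n):
        p2 = p2 * (1 + 2 * A / i) + K
    return p2

def p2_linear_asymptotic(n, m, A):
    """Leading term K * n / (1 - 2A), valid for 2A < 1."""
    B = m * (1 - 2 * A)
    K = 2 * m * (A + B) + m * (m - 1) / 2
    return K * n / (1 - 2 * A)
```

For $2A < 1$ the ratio of the two functions approaches 1 as $n$ grows, with a relative error of order $n^{-(1-2A)}$.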